Abstract: We study the problem of storing a data object in a set of data nodes that fail independently with given probabilities. Our problem is a natural generalization of the homogeneous storage allocation problem, in which all nodes have the same reliability, and is motivated by peer-to-peer and cloud storage systems with different types of nodes. Assuming optimal erasure coding (MDS), the goal is to find a storage allocation (i.e., how much to store in each node) that maximizes the probability of successful recovery. This turns out to be a challenging combinatorial optimization problem. In this work we introduce an approximation framework based on large deviation inequalities and convex optimization. We propose two approximation algorithms and study the asymptotic performance of the resulting allocations. SUBMITTED TO ISIT 2012.

I. Introduction 

We are interested in heterogeneous storage systems where storage nodes have different reliability parameters. This problem is relevant for heterogeneous peer-to-peer storage networks and cloud storage systems that use multiple types of storage devices, e.g., solid state drives along with standard hard disks. We model this problem by considering $n$ storage nodes and a data collector that accesses a random subset $r$ of them. The probability distribution of $r \subseteq \{1, \ldots, n\}$ models random node failures, and we assume that node $i$ fails independently with probability $1 - p_i$. The probability of a set $r$ of nodes being accessed is therefore:

$$ P(r) = \prod_{i \in r} p_i \prod_{j \notin r} (1 - p_j). \tag{1} $$



Assume now that we have a single data file of unit size that we wish to code and store over these nodes to maximize the probability of recovery after a random set of nodes fail. The problem becomes trivial if we do not put a constraint on the maximum size $T$ of coded data, and hence we will work with a maximum storage budget of size $T < n$: if $x_i$ is the amount of coded data stored in node $i$, then $\sum_{i=1}^{n} x_i \le T$. We further assume that our file is optimally coded, in the sense that successful recovery occurs whenever the total amount of data accessed by the data collector is at least the size of the original file. This is possible in practice when we use Maximum Distance Separable (MDS) codes [1]. The probability of successful recovery for an allocation $(x_1, \ldots, x_n)$ can be written as

$$ P_s = \sum_{r \subseteq \{1,\ldots,n\}} P(r)\, \mathbf{1}\Big\{ \sum_{i \in r} x_i \ge 1 \Big\}, $$

where $\mathbf{1}\{\cdot\}$ is the indicator function: $\mathbf{1}\{S\} = 1$ if the statement $S$ is true and zero otherwise.

A more concrete way to see this problem is by introducing a $Y_i \sim \mathrm{Bernoulli}(p_i)$ random variable for each storage node: $Y_i = 1$ when node $i$ is accessed by the data collector and $Y_i = 0$ when node $i$ has failed. Define the random variable

$$ Z = \sum_{i=1}^{n} x_i Y_i, \tag{2} $$

where $x_i$ is the amount of data stored in node $i$. Then we have $P_s = \mathbb{P}[Z \ge 1]$.

Our goal is to find a storage allocation $(x_1, \ldots, x_n)$ that maximizes the probability of successful recovery or, equivalently, minimizes the probability of failure, $\mathbb{P}[Z < 1]$.
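For very small systems the recovery probability can be evaluated directly from its definition as a sum over all access patterns. The following is a minimal sketch under stated assumptions: the function name `recovery_probability` and the small numerical tolerance on the threshold are illustrative choices, not part of the paper, and the enumeration takes time exponential in n.

```python
from itertools import product


def recovery_probability(x, p):
    """Exact P[Z >= 1] by enumerating all 2^n access patterns.

    x[i] is the amount stored in node i, p[i] its access probability.
    Only practical for small n (hypothetical helper, for illustration).
    """
    total = 0.0
    for pattern in product((0, 1), repeat=len(x)):
        prob = 1.0      # P(r) for this access pattern
        stored = 0.0    # total data available to the collector
        for xi, pi, yi in zip(x, p, pattern):
            prob *= pi if yi else (1.0 - pi)
            stored += xi * yi
        # Assumption: a tiny tolerance guards against rounding in the sum.
        if stored >= 1.0 - 1e-12:
            total += prob
    return total


if __name__ == "__main__":
    # Whole file on one node: recovery succeeds iff that node survives.
    print(recovery_probability([1.0, 0.0], [0.9, 0.5]))
    # Half the file on each of two nodes: both nodes must survive.
    print(recovery_probability([0.5, 0.5], [0.9, 0.9]))
```

The two calls illustrate the trade-off the paper studies: concentrating the budget ties recovery to a single node, while spreading it requires several nodes to survive at once.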

II. Optimization Problem 

Put in optimization form, we would like to find a solution to the following problem.

$$ \begin{aligned} Q1:\quad \underset{x_i}{\text{minimize}} \quad & \sum_{r \subseteq \{1,\ldots,n\}} P(r)\, \mathbf{1}\Big\{ \sum_{i \in r} x_i < 1 \Big\} \\ \text{subject to:} \quad & \sum_{i=1}^{n} x_i \le T, \\ & x_i \ge 0, \quad i = 1, \ldots, n. \end{aligned} $$

The authors of [1] consider a special case of problem Q1 in which $p_i = p$ for all $i$. Even in this symmetric case the problem appears to be very difficult to solve due to its non-convex and combinatorial nature. In fact, even for a given allocation $\{x_i\}$ and parameter $p$, computing the objective function is computationally intractable (#P-hard, see [1]).

A very interesting observation about this problem follows directly from Markov's inequality: $\mathbb{P}[Z \ge 1] \le \mathbb{E}[Z] = pT$. If $pT < 1$, then the probability of successful recovery is bounded away from 1. This has motivated the definition of a region of parameters for which high probability of recovery is possible: $\mathcal{R}_{HP} = \{(p, T) : pT \ge 1\}$. The budget $T$ should be at least $1/p$ if we want to aim for high reliability, and the authors of [1] showed that in the above region of parameters, maximally spreading the budget to all nodes (i.e., $x_i = T/n$, $\forall i$) is an asymptotically optimal allocation as $n \to \infty$.

In the general case, when the node access probabilities $p_i$ are not equal, one could follow similar steps to characterize a region of high probability of recovery. Markov's inequality yields:

$$ \mathbb{P}[Z \ge 1] \le \mathbb{E}[Z] = \sum_{i=1}^{n} x_i p_i = \mathbf{p}^T \mathbf{x}, $$

where $\mathbf{p} = [p_1, p_2, \ldots, p_n]^T$ and $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$. If we do not want $\mathbb{P}[Z \ge 1]$ to be bounded away from 1, we now have to require that $\mathbf{p}^T \mathbf{x} \ge 1$. We see that in this case high reliability is not merely a matter of sufficient budget, as it depends on the allocation $\mathbf{x}$ itself.

Let $S(\mathbf{p}, T) = \{\mathbf{x} \in \mathbb{R}^n_+ : \mathbf{p}^T\mathbf{x} \ge 1,\ \mathbf{1}^T\mathbf{x} \le T\}$ be the set of all allocations $\mathbf{x}$ with a given budget constraint $T$ that satisfy $\mathbf{p}^T\mathbf{x} \ge 1$ for a given $\mathbf{p}$. We call these allocations reliable for a system with parameters $\mathbf{p}, T$, in the sense that the resulting probability of successful recovery is not bounded away from 1. Then the region of high probability of recovery can be defined as the region of parameters $\mathbf{p}, T$ such that the set $S(\mathbf{p}, T)$ is non-empty:

$$ \mathcal{R}_{HP} = \{ (\mathbf{p}, T) : S(\mathbf{p}, T) \neq \emptyset \}. $$

This generalizes the region described in [1]. If all $p_i$'s are equal, then the set $S(\mathbf{p}, T)$ is non-empty when $\mathbf{p}^T\mathbf{x} = pT \ge 1$. In the general case, the minimum budget such that $S(\mathbf{p}, T)$ is non-empty is $T = 1/p_{\max}$, with $p_{\max} = \max_i\{p_i\}$, and $S(\mathbf{p}, 1/p_{\max})$ contains only one allocation $\mathbf{x}$: $x_j = 1/p_{\max}$ for $j = \arg\max_i p_i$, and $x_i = 0$, $\forall i \ne j$.

Even though $\mathcal{R}_{HP}$ provides a lower bound on the minimum budget $T$ required for high reliability, it does not provide any insight into how to design allocations that achieve high probability of recovery in a distributed storage system. This motivates us to move one step further and define a region of $\epsilon$-optimal allocations in the next section.

III. The region of $\epsilon$-optimal allocations

We say that an allocation $(x_1, x_2, \ldots, x_n)$ is $\epsilon$-optimal if the corresponding probability of successful recovery, $\mathbb{P}[Z \ge 1]$, is greater than $1 - \epsilon$.

Let $\mathcal{E}_n(\mathbf{p}, T, \epsilon) = \{ \mathbf{x} \in \mathbb{R}^n_+ : \mathbb{P}[Z < 1] < \epsilon,\ \mathbf{1}^T\mathbf{x} \le T \}$ be the set of all $\epsilon$-optimal allocations. Note that if we could efficiently characterize this set for all problem parameters, we would be able to solve problem Q1 exactly: find the smallest $\epsilon$ such that $\mathcal{E}_n(\mathbf{p}, T, \epsilon)$ is non-empty.

In this section we derive a sufficient condition for an allocation to be $\epsilon$-optimal and provide an efficient characterization of a region $\mathcal{H}_n(\mathbf{p}, T, \epsilon) \subseteq \mathcal{E}_n(\mathbf{p}, T, \epsilon)$. We begin with a very useful lemma.

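Lemma 1. (Hoeffding's Inequality [2], [3])
Consider the random variable $W = \sum_{i=1}^{n} V_i$, where the $V_i$ are independent almost surely bounded random variables with $\mathbb{P}(V_i \in [a_i, b_i]) = 1$. Then,

$$ \mathbb{P}\big[ W \le \mathbb{E}[W] - n\delta \big] \le \exp\left\{ -\frac{2 n^2 \delta^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right\} $$

for any $\delta > 0$.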
We can use Lemma 1 to upper bound the probability of failure, $\mathbb{P}[Z < 1] \le \mathbb{P}[Z \le 1]$, for an arbitrary allocation, since $Z = \sum_{i=1}^{n} x_i Y_i$ can be seen as the sum of $n$ independent almost surely bounded random variables $V_i = x_i Y_i$, with $\mathbb{P}(V_i \in [0, x_i]) = 1$. Let $\delta = \big(\sum_{i=1}^{n} x_i p_i - 1\big)/n$ and require $\delta > 0 \Rightarrow \sum_{i=1}^{n} x_i p_i > 1$. Lemma 1 yields:

$$ \mathbb{P}[Z \le 1] \le \exp\left\{ -\frac{2\big(\sum_{i=1}^{n} x_i p_i - 1\big)^2}{\sum_{i=1}^{n} x_i^2} \right\}, \qquad \sum_{i=1}^{n} x_i p_i > 1. \tag{3} $$

Notice that the constraint $\sum_{i=1}^{n} x_i p_i > 1$ requires the allocation $(x_1, x_2, \ldots, x_n)$ to be reliable and $S(\mathbf{p}, T) \neq \emptyset$.

In view of the above, a sufficient condition for a strictly reliable allocation to be $\epsilon$-optimal is the following:

$$ \exp\left\{ -\frac{2(\mathbf{p}^T\mathbf{x} - 1)^2}{\|\mathbf{x}\|_2^2} \right\} \le \epsilon, \qquad \mathbf{p}^T\mathbf{x} > 1. \tag{4} $$

We say that all allocations satisfying the above condition are Hoeffding $\epsilon$-optimal, due to the use of Hoeffding's inequality in Lemma 1.

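Definition 1. "The Region of Hoeffding $\epsilon$-optimal allocations"

$$ \mathcal{H}_n(\mathbf{p}, T, \epsilon) = \left\{ \mathbf{x} \in \mathbb{R}^n_+ : \mathbf{p}^T\mathbf{x} > 1,\ \mathbf{1}^T\mathbf{x} \le T,\ \sqrt{\frac{\ln(1/\epsilon)}{2}}\, \|\mathbf{x}\|_2 \le \mathbf{p}^T\mathbf{x} - 1 \right\} \tag{5} $$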



The above region is strictly smaller than $\mathcal{E}_n(\mathbf{p}, T, \epsilon)$ for any finite $n$, because the bound in (3) is not generally tight. However, $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$ is a convex set: Equation (4) can be seen as a second order cone constraint on the allocation $\mathbf{x} \in \mathbb{R}^n_+$.

Theorem 1. The region of Hoeffding $\epsilon$-optimal allocations $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$ is convex in $\mathbf{x}$.

This interesting result allows us to formulate and efficiently solve optimization problems over $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$. Finding the smallest $\epsilon^*$ such that $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$ is non-empty will produce an $\epsilon^*$-optimal solution to problem Q1.
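The Hoeffding bound on the failure probability and the membership test for the Hoeffding region are cheap to evaluate for a given allocation. The sketch below is illustrative only: the function names and the feasibility tolerance are assumptions, and the bound is reported as the trivial value 1 when the allocation is not strictly reliable, since the inequality gives no information in that case.

```python
import math


def hoeffding_bound(x, p):
    """Hoeffding upper bound on P[Z <= 1] for allocation x.

    Returns the trivial bound 1.0 when p^T x <= 1, where the
    inequality does not apply.
    """
    mean = sum(xi * pi for xi, pi in zip(x, p))   # p^T x = E[Z]
    if mean <= 1.0:
        return 1.0
    sq_norm = sum(xi * xi for xi in x)            # ||x||_2^2
    return math.exp(-2.0 * (mean - 1.0) ** 2 / sq_norm)


def in_hoeffding_region(x, p, T, eps, tol=1e-9):
    """Check the three conditions defining the Hoeffding region.

    tol is an assumed numerical slack on the budget constraint.
    """
    mean = sum(xi * pi for xi, pi in zip(x, p))
    feasible = all(xi >= 0.0 for xi in x) and sum(x) <= T + tol
    return feasible and mean > 1.0 and hoeffding_bound(x, p) <= eps


if __name__ == "__main__":
    n, T = 100, 2.0
    p = [0.8] * n
    x = [T / n] * n                      # maximal spreading
    print(hoeffding_bound(x, p))         # equals exp(-18) up to rounding
    print(in_hoeffding_region(x, p, T, 1e-6))
```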

A. Hoeffding Approximation of Q1

If we fix $\mathbf{p}, T, n$ as the problem parameters, then the following optimization problem can be solved efficiently, to any desired accuracy $1/a$, by solving a sequence of $O(\log a)$ convex feasibility problems (bisection on $\epsilon$).

$$ \begin{aligned} H1:\quad \min_{\mathbf{x},\,\epsilon} \quad & \epsilon \\ \text{s.t.:} \quad & \mathbf{x} \in \mathcal{H}_n(\mathbf{p}, T, \epsilon) \end{aligned} $$

We will see next that if $T$ is sufficiently large, $\epsilon^*$ goes to zero exponentially fast as $n$ grows, and hence the solution to the aforementioned problem is asymptotically optimal.
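As a side observation that is not stated in the paper, the problem H1 can also be read as a ratio maximization. For a fixed strictly reliable allocation, the smallest $\epsilon$ that satisfies the cone constraint of the Hoeffding region is obtained by solving that constraint with equality. The symbol $\rho^*$ below is introduced here only for illustration.

```latex
% Smallest epsilon for a fixed allocation x with p^T x > 1:
\epsilon(\mathbf{x}) = \exp\left\{ -\frac{2(\mathbf{p}^T\mathbf{x} - 1)^2}{\|\mathbf{x}\|_2^2} \right\}

% Hence the optimal value of H1 can be written as
\epsilon^* = \exp\{ -2 (\rho^*)^2 \},
\qquad
\rho^* = \sup_{\substack{\mathbf{x} \ge 0,\ \mathbf{1}^T\mathbf{x} \le T \\ \mathbf{p}^T\mathbf{x} > 1}}
         \frac{\mathbf{p}^T\mathbf{x} - 1}{\|\mathbf{x}\|_2}
```

Under this reading, each bisection step on $\epsilon$ asks whether some feasible allocation achieves a ratio of at least $\sqrt{\ln(1/\epsilon)/2}$, which is the second order cone feasibility problem mentioned above.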



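B. Maximal Spreading Allocations and the Asymptotic Optimality of H1

First, we focus on maximal spreading allocations, $\mathbf{x}^s_n = \{\mathbf{x} \in \mathbb{R}^n_+ : x_i = T/n\}$, and derive their asymptotic optimality for Q1, in the sense that $\mathbb{P}[Z < 1] \to 0$ as $n \to \infty$. Let $\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i$ be the average access probability across all nodes. We have the following lemma.

Lemma 2. If $T > 1/\bar{p}$, then for any $\epsilon > 0$, $\exists n_\epsilon$ : $\mathbf{x}^s_n \in \mathcal{H}_n(\mathbf{p}, T, \epsilon)$, for all $n \ge n_\epsilon$.

Proof: This follows directly from the definition of $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$. Substituting $x_i = T/n$ gives $\mathbf{p}^T\mathbf{x} = T\bar{p}$ and $\|\mathbf{x}\|_2^2 = T^2/n$, so the cone constraint in (5) holds whenever $n \ge n_\epsilon = \frac{\ln(1/\epsilon)}{2(\bar{p} - 1/T)^2}$. ■
The above lemma establishes the asymptotic optimality of maximal spreading allocations through the following corollary.

Corollary 1. The probability of failed recovery, $P_e = \mathbb{P}[Z < 1]$, for a maximal spreading allocation satisfies $P_e \le e^{-2n(\bar{p} - 1/T)^2}$. When $T > 1/\bar{p}$, $P_e \to 0$ as $n \to \infty$.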
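The bound for maximal spreading and the number of nodes needed to reach a target failure level can be evaluated in closed form. The sketch below is a hedged illustration: the function names are hypothetical, and `nodes_needed` treats the average access probability as a fixed number that does not change with n, which is an assumption made here for simplicity.

```python
import math


def spreading_bound(p, T):
    """Hoeffding bound exp(-2 n (p_bar - 1/T)^2) for x_i = T/n.

    Returns the trivial bound 1.0 when T * p_bar <= 1.
    """
    n = len(p)
    p_bar = sum(p) / n
    if T * p_bar <= 1.0:
        return 1.0
    return math.exp(-2.0 * n * (p_bar - 1.0 / T) ** 2)


def nodes_needed(p_bar, T, eps):
    """Smallest n for which the spreading bound is at most eps.

    Assumes a fixed average access probability p_bar and 0 < eps < 1.
    """
    if T * p_bar <= 1.0:
        raise ValueError("requires T > 1 / p_bar")
    return math.ceil(math.log(1.0 / eps) / (2.0 * (p_bar - 1.0 / T) ** 2))


if __name__ == "__main__":
    n_eps = nodes_needed(0.8, 2.0, 1e-3)
    print(n_eps, spreading_bound([0.8] * n_eps, 2.0))
```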

The fact that $\mathcal{H}_n(\mathbf{p}, T, \epsilon)$ contains maximal spreading allocations for $T > 1/\bar{p}$ provides a sufficient condition for the asymptotic optimality of H1.

Theorem 2. Let $\epsilon^*$ be the optimal value of H1. If $T > 1/\bar{p}$, then $\epsilon^* = O(\exp(-n))$.

Proof: Let $T > 1/\bar{p}$ and consider the maximal spreading allocation $\mathbf{x}^s_n$. Then $\epsilon^* \le \epsilon_s$, where $\epsilon_s$ is the minimum $\epsilon$ such that $\mathbf{x}^s_n \in \mathcal{H}_n(\mathbf{p}, T, \epsilon)$. That is, $\epsilon_s = e^{-2n(\bar{p} - 1/T)^2}$, and since $T > 1/\bar{p}$, $\epsilon^* \le \epsilon_s = O(\exp(-n))$. ■



IV. Chernoff Relaxation 

In this section we take a different approach to obtain a tractable convex relaxation for Q1 by minimizing an appropriate Chernoff upper bound.

A. Upper Bounding the Objective Function 

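Lemma 3. (Upper Bound) Let $Z = \sum_{i=1}^{n} x_i Y_i$, $x_i \ge 0$, $Y_i \sim \mathrm{Bernoulli}(p_i)$ and $t > 0$. The probability of failed recovery, $\mathbb{P}[Z < 1]$, is upper bounded by

$$ \mathbb{P}[Z < 1] \le g_t(\mathbf{x}) = \sum_{r \subseteq \{1,\ldots,n\}} P(r) \exp\Big\{ -t \Big( \sum_{i \in r} x_i - 1 \Big) \Big\}. $$

Proof: For any $t > 0$ we have:

$$ \begin{aligned}
\mathbb{P}[Z < 1] \le \mathbb{P}[Z \le 1] &= \mathbb{P}\big[ e^{-tZ} \ge e^{-t} \big] \\
&\le e^{t}\, \mathbb{E}\big[ e^{-tZ} \big] = e^{t}\, \mathbb{E}\Big[ \prod_{i=1}^{n} e^{-t x_i Y_i} \Big] \\
&= e^{t} \prod_{i=1}^{n} \big( 1 - p_i + p_i e^{-t x_i} \big) && (6) \\
&= e^{t} \sum_{r \subseteq \{1,\ldots,n\}} P(r) \exp\Big\{ -t \sum_{i \in r} x_i \Big\} \\
&= \sum_{r \subseteq \{1,\ldots,n\}} P(r) \exp\Big\{ -t \Big( \sum_{i \in r} x_i - 1 \Big) \Big\} && (7) \\
&= g_t(\mathbf{x}).
\end{aligned} $$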



Note that $g_t(\mathbf{x})$ is a weighted sum of convex functions with linear arguments, and hence convex in $\mathbf{x}$. Equation (7) makes the convex relaxation of the objective function apparent:

$$ \mathbf{1}\{x < a\} \le e^{-t(x - a)}, \quad \text{for any } t > 0. $$
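The two expressions for the Chernoff bound, the n-term product and the sum over all subsets, can be compared numerically on a small instance. This is a sketch for illustration only: the function names are hypothetical, and the subset version is exponential in n and is included only to check agreement with the product form.

```python
import math
from itertools import product


def chernoff_bound(x, p, t):
    """Product form: e^t * prod_i (1 - p_i + p_i * e^{-t x_i})."""
    value = math.exp(t)
    for xi, pi in zip(x, p):
        value *= 1.0 - pi + pi * math.exp(-t * xi)
    return value


def chernoff_bound_subsets(x, p, t):
    """Subset form: sum_r P(r) * exp(-t * (sum_{i in r} x_i - 1)).

    Enumerates all 2^n access patterns; small n only.
    """
    total = 0.0
    for pattern in product((0, 1), repeat=len(x)):
        prob, stored = 1.0, 0.0
        for xi, pi, yi in zip(x, p, pattern):
            prob *= pi if yi else (1.0 - pi)
            stored += xi * yi
        total += prob * math.exp(-t * (stored - 1.0))
    return total


if __name__ == "__main__":
    x, p = [0.5] * 4, [0.8] * 4
    t = 2.0 * math.log(4.0)   # an arbitrary positive value of t
    print(chernoff_bound(x, p, t), chernoff_bound_subsets(x, p, t))
```

For this instance the exact failure probability is 0.2^4 + 4(0.8)(0.2^3) = 0.0272, so the bound is valid but loose, which is why the choice of t matters.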



B. The Relaxed Optimization Problem

Before we move forward and state the relaxed optimization problem, we take a closer look at the constraint set $S = \{\mathbf{x} \in \mathbb{R}^n_+ : \mathbf{1}^T\mathbf{x} \le T\}$ of the original problem Q1. From a practical perspective, it is wasteful to allocate more than one unit of data (the file size) to a single node: if the node survives, then the data collector can always recover the file using only one unit of data, and hence any additional storage does not help. Also, an allocation using less than the available budget cannot have larger probability of successful recovery.

In the following lemma, we show that it is sufficient to consider allocations with $x_i \in [0, 1]$ and $\sum_{i=1}^{n} x_i = T$.

Lemma 4. For any $\mathbf{x} \in S$, $\exists \mathbf{x}' \in S' = \{\mathbf{x} \in \mathbb{R}^n_+ : \mathbf{1}^T\mathbf{x} = T,\ x_i \le 1,\ i = 1, \ldots, n\}$ such that $\mathbb{P}\big[\sum_{i=1}^{n} x'_i Y_i < 1\big] \le \mathbb{P}\big[\sum_{i=1}^{n} x_i Y_i < 1\big]$.

Proof: See the long version of this paper [4]. ■
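The relaxed optimization problem can be formulated as follows.

$$ \begin{aligned} R1:\quad \underset{\mathbf{x}}{\text{minimize}} \quad & g_t(\mathbf{x}) \\ \text{subject to:} \quad & \sum_{i=1}^{n} x_i = T, \\ & x_i \in [0, 1], \quad i = 1, \ldots, n. \end{aligned} $$

Note that, in general, one would like to minimize $\inf_{t > 0}\{g_t(\mathbf{x})\}$ instead of $g_t(\mathbf{x})$ for some $t > 0$. However, for now, we let $t$ be a free parameter and carry on with the optimization.

The important drawback of the above formulation hides in the objective function: although convex, $g_t(\mathbf{x})$ has an exponentially long description in the number of storage nodes, since the sum is still over all subsets $r \subseteq \{1, \ldots, n\}$. This can be circumvented if we consider minimizing $\log g_t(\mathbf{x})$ instead of $g_t(\mathbf{x})$ over the same set. Indeed, by (6), $\log g_t(\mathbf{x}) = t + \sum_{i=1}^{n} \log\big(1 - p_i + p_i e^{-t x_i}\big)$, which has only $n$ terms.

Lemma 5. $\log g_t(\mathbf{x})$ is convex in $\mathbf{x}$.

Proof: See the long version of this paper [4].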



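Lemma 6. For any $t > 0$,

$$ \arg\min_{\mathbf{x} \in S} g_t(\mathbf{x}) = \arg\min_{\mathbf{x} \in S} \sum_{i=1}^{n} \log\Big( 1 + \frac{p_i}{1 - p_i}\, e^{-t x_i} \Big), $$

where $S = \{\mathbf{x} \in \mathbb{R}^n_+ : \mathbf{1}^T\mathbf{x} \le T,\ \mathbf{x} \preceq \mathbf{1}\}$.

Proof: Let $\mathbf{x}^* = \arg\min_{\mathbf{x} \in S} g_t(\mathbf{x})$. Then $g_t(\mathbf{x}^*) \le g_t(\mathbf{x})$, $\forall \mathbf{x} \in S$. Taking the logarithm on both sides preserves the inequality since $\log(\cdot)$ is strictly increasing. Hence, $\log g_t(\mathbf{x}^*) \le \log g_t(\mathbf{x})$, $\forall \mathbf{x} \in S$, and subtracting $t + \sum_{i=1}^{n} \log(1 - p_i)$ from both sides yields the desired result and completes the proof. ■
In view of Lemmas 5 and 6 we can solve R1 through the following equivalent optimization problem.

$$ \begin{aligned} R2:\quad \underset{\mathbf{x}}{\text{minimize}} \quad & \sum_{i=1}^{n} \log\Big( 1 + \frac{p_i}{1 - p_i}\, e^{-t x_i} \Big) \\ \text{subject to:} \quad & \sum_{i=1}^{n} x_i = T, \\ & x_i \in [0, 1], \quad i = 1, \ldots, n. \end{aligned} $$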

R2 is a convex separable optimization problem with a polynomial size description and, in terms of complexity, it is "not much harder" than linear programming [5]. One can solve such problems numerically very efficiently using standard, "off-the-shelf" algorithms and optimization packages such as CVX [6], [7].

C. Insights from Optimality Conditions for R2 

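Here, we move one step further and take the KKT conditions for R2 in order to take a closer look at the structure of the optimal solutions. Let $r_i = \frac{p_i}{1 - p_i}$.

The Lagrangian for R2 is:

$$ L(\mathbf{x}, \mathbf{u}, \mathbf{v}, \lambda) = \sum_{i=1}^{n} \log\big( 1 + r_i e^{-t x_i} \big) + \lambda \Big( \sum_{i=1}^{n} x_i - T \Big) - \sum_{i=1}^{n} u_i x_i + \sum_{i=1}^{n} v_i (x_i - 1), $$

where $\lambda \in \mathbb{R}$, $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n_+$ are the corresponding Lagrange multipliers. The gradient is given by $\nabla_{x_i} L(\mathbf{x}, \mathbf{u}, \mathbf{v}, \lambda) = -\frac{t r_i}{r_i + e^{t x_i}} + \lambda - u_i + v_i$, and the KKT necessary and sufficient conditions for optimality yield:

$$ \begin{aligned}
-\frac{t r_i}{r_i + e^{t x_i^*}} + \lambda - u_i + v_i &= 0, \quad \forall i && (8) \\
\sum_{i=1}^{n} x_i^* &= T && (9) \\
0 \le x_i^* &\le 1, \quad \forall i && (10) \\
\lambda \in \mathbb{R}, \quad v_i, u_i &\ge 0, \quad \forall i && (11) \\
v_i (x_i^* - 1) = 0, \quad u_i x_i^* &= 0, \quad \forall i && (12)
\end{aligned} $$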

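Using the results from [8], the optimal solution to R2 is given by

$$ x_i^* = \begin{cases}
0 & \text{if } \dfrac{t r_i}{1 + r_i} \le \lambda^* \\[2mm]
1 & \text{if } \lambda^* \le \dfrac{t r_i}{r_i + e^{t}} \\[2mm]
\dfrac{1}{t} \log\Big( \dfrac{t r_i}{\lambda^*} - r_i \Big) & \text{otherwise}
\end{cases} \tag{13} $$

where $\lambda^*$ is chosen such that Eq. (9) is satisfied, i.e.,

$$ \sum_{i=1}^{n} x_i^*(\lambda^*) = T, \tag{14} $$

with $x_i^*(\lambda^*)$ given by (13).

Numerically, $\lambda^*$ can be computed via an iterative $O(n^2)$ algorithm described in [8], and hence this approach gives an even more efficient way to solve R2.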
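The structure of the optimal solution suggests a simple numerical recipe: for a trial multiplier, evaluate the three-case formula for every node, then adjust the multiplier until the allocations sum to the budget. The sketch below uses plain bisection on the multiplier, which is a swapped-in technique chosen for brevity and not the iterative algorithm of [8]. The function names are hypothetical, and the code assumes 0 < p_i < 1, 0 < T <= n, and a moderate t so that exp(t) does not overflow.

```python
import math


def r2_allocation(lam, r, t):
    """Three-case KKT formula for a trial multiplier lam > 0."""
    x = []
    for ri in r:
        if t * ri / (1.0 + ri) <= lam:
            x.append(0.0)                       # node receives nothing
        elif lam <= t * ri / (ri + math.exp(t)):
            x.append(1.0)                       # node is saturated
        else:
            x.append(math.log(t * ri / lam - ri) / t)
    return x


def solve_r2(p, T, t, iters=200):
    """Bisection on lam so that the allocations sum to T.

    The total allocation is non-increasing in lam: it equals n as
    lam -> 0 and 0 at lam = max_i t*r_i/(1+r_i).
    """
    r = [pi / (1.0 - pi) for pi in p]
    lo, hi = 0.0, max(t * ri / (1.0 + ri) for ri in r)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(r2_allocation(mid, r, t)) > T:
            lo = mid        # too much allocated: raise the multiplier
        else:
            hi = mid
    return r2_allocation(hi, r, t), hi


if __name__ == "__main__":
    p, T = [0.9, 0.8, 0.7, 0.6], 2.0
    t = sum(math.log(pi / (1.0 - pi)) for pi in p) / T
    x, lam = solve_r2(p, T, t)
    print(x, lam)
```

In the usage example, t is set to the sum of log r_i divided by T; for that choice the multiplier found by bisection should come out close to t/2 and the allocation proportional to log r_i.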

However, the most important aspect of the above result is that we can use equations (13) and (14) to obtain closed-form solutions for a certain region of problem parameters and analyze the performance of the resulting allocations.

D. The choice of parameter t > 0

It is clear that the optimal solution to R2 depends on our choice of $t > 0$. For example, $\frac{t r_i}{r_i + e^{t}} \to 0$ and $\frac{t r_i}{1 + r_i} \to \infty$ as $t \to \infty$, and $x_i^* = \lim_{t \to \infty} t^{-1} \log\big( t r_i / \lambda^* - r_i \big)$, $\forall i$. Equation (14) yields $x_i^* = T/n$, $\forall i$, and hence the maximal spreading allocation becomes optimal for R2 as $t \to \infty$. Even though this motivates the choice of maximal spreading allocations as approximate "one-shot" solutions for the original problem Q1, explicitly tuning the parameter $t$ can provide significantly better approximations.

In order to obtain the tightest bound from Lemma 3 we have to jointly minimize the objective in R2 with respect to $t > 0$ and $\mathbf{x}$. Toward this end, one can iteratively optimize R2 by fixing the value of one variable ($t$ or $\mathbf{x}$) at each step and minimizing over the other. The objective function does not increase from one iteration to the next, and hence the above procedure converges to a (possibly local) minimum. The above algorithm iteratively tunes the Chernoff bound introduced in this section and produces a minimizing allocation that can serve as an approximate solution to the original problem Q1.
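The t-step of the alternating procedure is a one-dimensional convex problem. The sketch below assumes that, for fixed x, the quantity being minimized is the logarithm of the Chernoff bound up to a constant, that is t plus the sum of log(1 + r_i e^{-t x_i}) with r_i = p_i/(1 - p_i). It locates the minimizer by bisection on the derivative, which is increasing in t. The function names are hypothetical and the code assumes 0 < p_i < 1.

```python
import math


def log_bound_slope(t, x, r):
    """Derivative in t of  t + sum_i log(1 + r_i * exp(-t * x_i))."""
    slope = 1.0
    for xi, ri in zip(x, r):
        w = ri * math.exp(-t * xi)
        slope -= xi * w / (1.0 + w)
    return slope


def tune_t(x, p, iters=100):
    """Minimizer over t >= 0 of the log Chernoff bound for fixed x."""
    r = [pi / (1.0 - pi) for pi in p]
    # At t = 0 the slope is 1 - p^T x; if it is non-negative the
    # bound cannot be improved by increasing t.
    if log_bound_slope(0.0, x, r) >= 0.0:
        return 0.0
    hi = 1.0
    while log_bound_slope(hi, x, r) < 0.0:   # bracket the sign change
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_bound_slope(mid, x, r) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Four identical nodes, x_i = 0.5, p_i = 0.8: the stationarity
    # condition can be solved by hand and gives t = 2 * ln 4.
    print(tune_t([0.5] * 4, [0.8] * 4), 2.0 * math.log(4.0))
```

Alternating this step with a solver for R2 at fixed t gives one possible realization of the iterative tuning described above.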

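For analytic purposes though, we can choose a value for $t$ as follows. Recall from Lemma 3 that $\mathbb{P}[Z < 1] \le g_t(\mathbf{x})$ for any $t > 0$. After taking logarithms, we would like to find a value for $t > 0$ that minimizes $b(t) = t + \sum_{i=1}^{n} \log\big(1 + r_i e^{-t x_i}\big)$. Notice that $b(t)$ is a convex function of $t$, with $b(t) > 0$, $\forall t \ge 0$, $b(0) = \sum_{i=1}^{n} \log(1 + r_i)$ and $\lim_{t \to \infty} b(t) = \infty$. The slope of $b(t)$ at zero is $b'(0) = 1 - \sum_{i=1}^{n} \frac{x_i r_i}{1 + r_i} = 1 - \sum_{i=1}^{n} x_i p_i$, which is negative if the allocation is strictly reliable.

When $t$ is large, $\log\big(1 + r_i e^{-t x_i}\big) \approx 0$, whereas for small values of $t$, $\log\big(1 + r_i e^{-t x_i}\big) \approx -t x_i + \log r_i$, and hence $b(t) \approx t + \sum_{i=1}^{n} \max\{-t x_i + \log r_i, 0\} \ge t + \max\big\{ \sum_{i=1}^{n} (-t x_i + \log r_i), 0 \big\}$. One way to choose $t$ that does not depend on $x_i$ is to make $\sum_{i=1}^{n} (-t x_i + \log r_i) = 0$; since $\sum_{i=1}^{n} x_i = T$, this gives $t = \frac{1}{T} \sum_{i=1}^{n} \log r_i$.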
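E. A closed-form allocation

In view of the above results we provide here a closed-form allocation (each $x_i$ is given as a function of $\mathbf{p}$ and $T$) that can be used to study the asymptotic performance of R2 and serve as a better "one-shot" approximate solution to Q1.

Let $\mathcal{E}(\cdot)$ be a shorthand notation for the sample average, such that $\mathcal{E} f(x) = \frac{1}{n} \sum_{i=1}^{n} f(x_i)$, in order to simplify the expressions. For the above choice of $t = \frac{1}{T} \sum_{i=1}^{n} \log r_i = n \mathcal{E} \log r / T$, equation (13) becomes

$$ x_i^* = \begin{cases}
0 & \text{if } \dfrac{n \mathcal{E} \log r}{T} \cdot \dfrac{r_i}{1 + r_i} \le \lambda^* \\[2mm]
1 & \text{if } \lambda^* \le \dfrac{n \mathcal{E} \log r}{T} \cdot \dfrac{r_i}{r_i + e^{n \mathcal{E} \log r / T}} \\[2mm]
\dfrac{T}{n \mathcal{E} \log r} \log\Big( \dfrac{n r_i \mathcal{E} \log r}{T \lambda^*} - r_i \Big) & \text{otherwise}
\end{cases} $$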



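Lemma 7. If $p_i > \frac{1}{2}$, $\forall i$ and $T < \frac{n \mathcal{E} \log r}{\log r_{\max}}$, then

$$ x_i^* = \frac{T}{n \mathcal{E} \log r} \log r_i, \quad \forall i, \tag{15} $$

where $r_{\max} = \max_i\{r_i\}$.

Proof: Assume that $\lambda^* \in \Big( \frac{t r_i}{r_i + e^{t}},\ \frac{t r_i}{1 + r_i} \Big)$ for all $i$, with $t = n \mathcal{E} \log r / T$. Then from Eq. (14),

$$ \sum_{i=1}^{n} \frac{T}{n \mathcal{E} \log r} \log\Big( \frac{n r_i \mathcal{E} \log r}{T \lambda^*} - r_i \Big) = T \ \Rightarrow\ \lambda^* = \frac{n \mathcal{E} \log r}{2T} \ \text{ and } \ x_i^* = \frac{T}{n \mathcal{E} \log r} \log r_i. $$

$\lambda^*$ is indeed in the required interval if

$$ \frac{n \mathcal{E} \log r}{T} \cdot \frac{r_i}{1 + r_i} > \frac{n \mathcal{E} \log r}{2T} \ \Rightarrow\ r_i > 1,\ \forall i \ \Rightarrow\ p_i > 1/2,\ \forall i $$

and

$$ \frac{n \mathcal{E} \log r}{2T} > \frac{n \mathcal{E} \log r}{T} \cdot \frac{r_i}{r_i + e^{n \mathcal{E} \log r / T}} \ \Rightarrow\ r_i < e^{n \mathcal{E} \log r / T},\ \forall i \ \Rightarrow\ T < \frac{n \mathcal{E} \log r}{\log r_{\max}}. $$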

Clearly, when all $p_i > 1/2$, $x_i = \frac{T}{n \mathcal{E} \log r} \log r_i$, $\forall i$, is a feasible suboptimal allocation for Q1. It is also suboptimal for R2 in general, since solving R2 via the proposed algorithms can only achieve a smaller probability of failed recovery. We have $P_e\{Q1\} \le P_e\{R2\} \le \mathbb{P}\big\{ \sum_{i=1}^{n} \frac{T}{n \mathcal{E} \log r} \log r_i\, Y_i < 1 \big\}$.
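The closed-form allocation is straightforward to compute. The following sketch is illustrative: the function names are hypothetical, and the second helper only checks the budget condition under which the allocation coincides with the optimum of R2 for the analytic choice of t.

```python
import math


def closed_form_allocation(p, T):
    """Allocation x_i = T * log(r_i) / sum_j log(r_j), r_i = p_i/(1-p_i).

    Requires 1/2 < p_i < 1 so that every log(r_i) is positive and finite.
    """
    if any(pi <= 0.5 or pi >= 1.0 for pi in p):
        raise ValueError("requires 1/2 < p_i < 1 for every node")
    log_r = [math.log(pi / (1.0 - pi)) for pi in p]
    total = sum(log_r)                      # equals n * E log r
    return [T * lr / total for lr in log_r]


def within_closed_form_budget(p, T):
    """Check T < (sum_i log r_i) / log r_max, i.e. every x_i < 1."""
    log_r = [math.log(pi / (1.0 - pi)) for pi in p]
    return T < sum(log_r) / max(log_r)


if __name__ == "__main__":
    p = [0.9, 0.8, 0.7, 0.6]
    print(closed_form_allocation(p, 2.0))
    print(within_closed_form_budget(p, 2.0))
```

More reliable nodes receive a larger share of the budget, in proportion to the log-odds of being accessed.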

In the following lemma we give an upper bound on the probability of failed recovery for this closed-form allocation and establish its asymptotic optimality.



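Lemma 8. If $p_i > \frac{1}{2}$, $\forall i$ and $T > \frac{\mathcal{E} \log r}{\mathcal{E} p \log r}$, the allocation $x_i = \frac{T}{n \mathcal{E} \log r} \log r_i$, $\forall i$, is strictly reliable, and the probability of failed recovery, $P_e = \mathbb{P}[Z < 1]$, is upper bounded by

$$ P_e \le \exp\left\{ -2n\, \frac{\big( \mathcal{E} p \log r - \mathcal{E} \log r / T \big)^2}{\mathcal{E} \log^2 r} \right\}, $$

where $\mathcal{E} p \log r = \frac{1}{n} \sum_{i=1}^{n} p_i \log r_i$ and $\mathcal{E} \log^2 r = \frac{1}{n} \sum_{i=1}^{n} \log^2 r_i$, and hence, when $T > \frac{\mathcal{E} \log r}{\mathcal{E} p \log r}$, $P_e \to 0$ as $n \to \infty$.

Proof: The proof follows directly from Lemma 1 and Equation (3). ■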

Notice that this closed-form allocation is reliable for values of $T$ for which a maximal spreading allocation $\mathbf{x}^s_n$ is not, since $\frac{1}{\bar{p}} \ge \frac{\mathcal{E} \log r}{\mathcal{E} p \log r}$, and hence its probability of failed recovery $P_e$ goes to zero exponentially fast for smaller values of $T$.
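The two budget thresholds and the bound for the closed-form allocation can be compared on a concrete set of access probabilities. The sketch below is a hedged illustration with hypothetical function names; it assumes 1/2 < p_i < 1 and returns the trivial bound 1 when the allocation is not strictly reliable.

```python
import math


def sample_mean(values):
    """Sample average, the script-E shorthand used in the text."""
    return sum(values) / len(values)


def reliability_thresholds(p):
    """Minimum budgets for reliability: (maximal spreading, closed form)."""
    log_r = [math.log(pi / (1.0 - pi)) for pi in p]
    spreading = 1.0 / sample_mean(p)
    closed = sample_mean(log_r) / sample_mean(
        [pi * lr for pi, lr in zip(p, log_r)])
    return spreading, closed


def closed_form_bound(p, T):
    """Upper bound on the failure probability of the closed-form allocation."""
    log_r = [math.log(pi / (1.0 - pi)) for pi in p]
    e_log = sample_mean(log_r)
    e_plog = sample_mean([pi * lr for pi, lr in zip(p, log_r)])
    e_log2 = sample_mean([lr * lr for lr in log_r])
    gap = e_plog - e_log / T
    if gap <= 0.0:
        return 1.0            # not strictly reliable: trivial bound
    return math.exp(-2.0 * len(p) * gap ** 2 / e_log2)


if __name__ == "__main__":
    p = [0.9, 0.8, 0.7, 0.6]
    print(reliability_thresholds(p))
    print(closed_form_bound(p, 2.0))
```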

V. Numerical Experiments 

In this section we evaluate the performance of the proposed approximate distributed storage allocations in terms of their probability of failed recovery and plot the corresponding bounds. In our simulations we consider an ensemble of distributed storage systems with $n = 100$ nodes, in which the corresponding access probabilities, $p_i \sim U(0.5, 1)$, are drawn uniformly at random from the interval $(0.5, 1)$.

Fig. 1. Performance of the proposed approximate distributed storage allocations and their corresponding upper bounds for a system with $n = 100$ nodes and $p_i \sim U(0.5, 1)$.

We consider the following allocations. 1) Maximal spreading: $x_i = T/n$, $\forall i$. 2) Chernoff closed-form: $x_i = \frac{T}{n \mathcal{E} \log r} \log r_i$, $\forall i$. 3) Hoeffding $\epsilon$-optimal: obtained by solving H1. 4) Chernoff iterative: obtained by solving R2 and iteratively tuning the parameter $t$.

Fig. 1 shows, in solid lines, the ensemble average probability of failed recovery of each allocation, $\mathbb{P}\big[\sum_{i=1}^{n} x_i Y_i < 1\big]$, versus the maximum available budget $T$. In dashed lines, Fig. 1 plots the corresponding bounds on $P_e$ obtained from Corollary 1, Lemma 8 and the objective functions of H1, R1.
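A rough Monte Carlo sketch of this kind of experiment is given below for the two allocations that have closed forms. It is not the authors' simulation code: the function names, the number of trials, the seed, and the clamping of the sampled probabilities away from 1/2 and 1 are all assumptions made for illustration, and it simulates a single random system rather than an ensemble average.

```python
import math
import random


def simulate_failure(x, p, trials, rng):
    """Estimate P[sum_i x_i Y_i < 1] by sampling node accesses."""
    failures = 0
    for _ in range(trials):
        z = sum(xi for xi, pi in zip(x, p) if rng.random() < pi)
        if z < 1.0:
            failures += 1
    return failures / trials


def experiment(n=100, T=1.5, trials=2000, seed=0):
    """Compare maximal spreading with the Chernoff closed-form allocation."""
    rng = random.Random(seed)
    # Access probabilities drawn from U(0.5, 1); clamped (assumption)
    # so that log(p / (1 - p)) is positive and finite.
    p = [min(max(rng.uniform(0.5, 1.0), 0.5 + 1e-9), 1.0 - 1e-9)
         for _ in range(n)]
    log_r = [math.log(pi / (1.0 - pi)) for pi in p]
    spreading = [T / n] * n
    closed = [T * lr / sum(log_r) for lr in log_r]
    return {
        "maximal spreading": simulate_failure(spreading, p, trials, rng),
        "chernoff closed-form": simulate_failure(closed, p, trials, rng),
    }


if __name__ == "__main__":
    for budget in (1.2, 1.4, 1.6):
        print(budget, experiment(T=budget))
```

Sweeping the budget T and averaging over several seeds would give curves of the kind plotted in solid lines in Fig. 1.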

References 

[1] D. Leong, A. Dimakis, and T. Ho. Distributed storage allocations. CoRR, abs/1011.5287, 2010.

[2] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Stat. Association, 58(301):13-30, March 1963.

[3] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY, USA, 2005.

[4] V. Ntranos, G. Caire, and A. Dimakis. Allocations for heterogenous distributed storage (long version). http://www-scf.usc.edu/~ntranos/docs/HDS-long.pdf, January 2012.

[5] D. S. Hochbaum and J. George Shanthikumar. Convex separable optimization is not much harder than linear optimization. J. ACM, 37:843-862, October 1990.

[6] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 1.21. http://cvxr.com/cvx, April 2011.

[7] M. Grant and S. Boyd. Graph implementations for nonsmooth convex programs. In V. Blondel, S. Boyd, and H. Kimura, editors, Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pages 95-110. Springer-Verlag Limited, 2008.

[8] S. M. Stefanov. Convex separable minimization subject to bounded variables. Comp. Optimization and Applications, 18, 2001.



